ON THE POSSIBILITY OF A
REINFORCEMENT THEORY OF COGNITIVE LEARNING



By Kendon Smith
University of North Carolina at Greensboro 

During the past decade or so, considerable support has arisen for what
might be called a neo-Tolmanian view of motivation and learning (e.g., Cofer
& Appley, 1964; Atkinson & Wickens, 1971; Estes, 1971; Logan, 1971; Bindra,
1972; Bolles, 1972). In general, the elements of that view are: that learning,
properly speaking, is the development of new sequential linkages between
cognitive events; that such learning occurs by a principle of sheer contiguity;
that, although cognition is something fundamentally different from behavior,
it acts to guide behavior; and, thus, that changes in cognitive linkages, as
brought about by learning, are reflected in the modification of overt
performance.

As has been true since Tolman first espoused it, this sort of approach
has a certain phenomenal validity which merits attention and respect. All
the same, there are a number of considerations which dictate caution in
embracing it. Their net effect is to diminish the attractiveness of the theory
and, at the same time, to suggest gently but persistently that, even in
cognitive learning, reinforcement is a fundamental variable. I should like
to cite those considerations, and to add to them certain others, hoping to
persuade you that a reinforcemental theory of cognitive learning is not, at
least for the moment, entirely out of the question.

The first of the critical points to be mentioned is a very general one.
It is simply that, by any meaningful criterion, cognitive activity is in
fact a kind of behavior. Cognitive events arise as responses to other
cognitive events or to external stimulation. They serve, in turn, as stimuli
to still further cognitive events or to frankly motoric responses.
Functionally speaking, they are embedded in the ongoing stream of behavior and are
an intrinsic part of it (Skinner, 1964; Berlyne, 1965; Homme, 1965; Smith,
1969).

General though it may be, this point has some relevance to the present
discussion. If cognitive events are indeed a form of behavior, they should
be subject to the same laws as are other forms of behavior. Specifically,
they should manifest learning in accordance with the same rules. It is
therefore significant to note that very little firm evidence has emerged from other
realms of behavior to indicate that learning can be accomplished by contiguity
alone (Smith, 1967; Hilgard & Bower, 1966, pp. 109-110). Admittedly, this
matter is still distinctly moot; but the balance of recent thinking with
respect to learning-in-general seems to favor reinforcement over simple
association. A priori, then, one would be inclined to look for
reinforcemental factors in cognitive learning, in particular, and not to expect the
latter to proceed, uniquely, by association alone.

We need not, however, settle for a mere "a priori" on this point. Thus,
the second consideration I wish to raise has to do with the failure of pure
contiguity in cognitive learning, specifically.

The basic datum, here, is completely familiar. It consists of the
well-known fact that a given percept, image, or thought may be followed closely,
time and time again, by another one, so that almost unlimited contiguity is
provided; and yet, the subsequent occurrence of the first cognition will have
no tendency at all to evoke the second. Textbooks sometimes remark upon this
fact, and furnish striking instances of it. Each of us, I imagine, can think
of equally striking cases in his own experience. The suggestion commonly
arising from these circumstances is that reinforcement is involved, somehow,
in what appears to be learning by pure association (e.g., Berlyne, 1965, p. 100).

The third and final consideration to be mentioned is one which, again,
will be recognized as familiar - and one over which, I dare say, we need not
linger. It is the nagging question as to how cognition leads, in the end, to
behavior - Guthrie's classical problem of the animal "buried in thought." Both
Bindra (1972) and Bolles (1972) have made serious efforts, recently, to
resolve that problem; but it seems to me that even their arguments are not
entirely persuasive, and that the problem still remains.

Now, the points so far mentioned are, on the whole, negative. They tend
mainly to impair a contiguity theory of cognitive learning, and only
secondarily, somewhat by default, to strengthen a reinforcement theory. The
question arises, then, as to whether any considerations which are more positive
in their support of a reinforcement view can be adduced.

Interestingly enough, at least some considerations of this sort have been
on record for a good many years. In 1947, in a brief and rather informal
paper, R. S. Woodworth gave attention to what he called the "Reenforcement of
perception" (Woodworth, 1947). His basic point was one which seems worthy
of notice today, and perhaps even of generalization to areas of cognition
beyond that of perception.

As Woodworth himself put it, the heart of his argument was the premise
that "...perception is always driven by a direct, inherent motive which might
be called the will to perceive. Whatever ulterior motives might be present
from time to time, this direct perceptual motive is always present in any
use of the senses ... [1947, p. 123]."

It is true that many, including myself, would be dubious about an
"inherent ... will to perceive." It is to be noted, however, that Woodworth
was willing to recognize "ulterior motives," too; and it would not be difficult
to modify his hypothesis somewhat, in the direction thus suggested, to avoid
its dependence on anything as problematical as an innate need to perceive.
One could simply recognize the fact that veridical perception has great
practical utility. In avoiding pain, in finding food or mate, the
preliminary response of interpreting sensory input is of obvious value, and the act
of perceiving is thus reinforced over and over again. Granting such continual
primary reinforcement, we should expect the successful interpretation of
sensory input to take on and maintain a strong secondary reinforcemental value
of its own - and thus, indeed, to become as rewarding as Woodworth says it is.

It is the main point of the present essay that Woodworth's general line
of reasoning could be extended from the realm of perceptual learning to that
of associative learning proper. If the act of perception subserves many
drives, and can thus be seen as a generalized reinforcer, the act of
association surely does so, too, and can likewise be seen as a generalized
reinforcer. It is useful to the organism in many ways if one percept, image,
or idea, as a stimulus, evokes another, as a response. Sequences of such
cognitive S-R's - "associative chains," if you like - enable the organism to
test, covertly and tentatively, lines of behavior whose overt expression
might be protracted, effortful, and even painful. It would be strange if
useful cognitive associations did not begin to carry a secondary-reward value
of their own; so that, if the environment, or the organism's own thought
processes, imposed upon the organism a pairing of cognitive responses; and
if that pairing were of a sort which had been valuable in the past; there
would arise a secondary reinforcement effect, and a corresponding
strengthening of the tendency for the one cognitive event subsequently to evoke the
other. In sum, and in a different idiom, one might say that the organism's
"storage of information" is an operant event; and Stein (oral statement, 1968;
cited in Carroll, 1971) has reportedly advanced just such a suggestion.

Now, a reinforcemental view of cognitive learning, to be at all complete,
ought to be able to offer a self-consistent answer to the riddle of how
cognitive activity can guide overt behavior. In the few minutes that remain,
I should like to suggest one such answer, provisional though it may be.

Within the frame of reference which has now been developed, then, the
organism can be pictured as being equipped with a large number of cognitive
S-R habits. When the organism happens to encounter a problem situation, those
habits begin to function. The perception of the situation itself evokes a
learned imaginal response; that, in general, another such response; that,
another; and so on. The organism thus reels off a series of instrumental
cognitive responses. Typically, they represent successive overt responses
making up some course of action; and, sooner or later, one of the imaged
responses finally evokes an imagined punishment or reinforcement.

Now, the fascinating thing about such a sequence is that it seems to
act as a surrogate learning experience. If it ends up with imaged punishment,
the corresponding sequence of overt responses is not likely to be made. If,
on the other hand, it ends up with imaged reinforcement, the likelihood of
actual performance is enhanced - quite possibly to the level at which the
behavior in question appears explicitly. It is evident, then, that we have
here a possible explanation for the effects of cognition upon behavior - if
it can be regarded as reasonable that an organism can learn from a sequence
of its own images.

It is tempting, of course, to think of the organism as "modelling upon
its own imagery." But it really seems most likely that, in the end, imagery
will explain the efficacy of modelling, rather than vice-versa. Hence, a
real explanation does not seem to lie simply in the direction of modelling.

Another direction does offer at least tentative hope, however. There
is now a rather general acceptance of the idea that perception and imagery are
essentially similar functions (cf. Neisser, 1972, and Zikmund, 1972) - the one
occurring in the presence of the defining stimulus-situation, the other
occurring in the absence of that situation. To image a series of events is thus,
to some extent, to perceive it. To actually live through a series of events,
however, is also to perceive it. It would accordingly follow that what goes
on in the nervous system during imagination is rather like what goes on during
actual experience with the corresponding environmental circumstances. Given
some latitude in expression, it could be said that to image a series of events
produces essentially the same neural changes as would be produced by direct
experience with the events themselves.

The balance of the argument is perhaps not difficult to anticipate. It
would suggest that the organism has had an experience essentially equivalent
to that of behaving overtly and, in the case of interest here, being reinforced.
Learning has thus occurred, and the effect of that learning has been to link a
new series of responses with the problem situation. As the animal is, in fact,
still in that situation, the learned responses are cued, and they are carried
out.

In this fashion, then, cognition might possibly give rise to behavior.
It is worth noting, in closing, that it would do so in a completely
determinate way, in accordance with the ordinary principles of learning. The notion
that there might be some sort of free decision, on the part of the organism,
to "use" its cognitive experience would be, in this framework, completely
inappropriate.¹

¹ I wish to thank Ms. Laureen S. Martin and Professors E. A. Lumsden and R. L.
Shull for their helpful comments on an earlier draft of this paper.